A Study on Inference Control in Natural Language Processing

Author

  • Takashi Miyata
Abstract

Natural language processing requires flexible control of computation over various sorts of constraints, such as those of syntax, semantics, and pragmatics. This study aims to propose and verify a new approach that describes a system declaratively with constraints and controls inferences guided by general principles based on probability. This approach is an alternative to the earlier procedural approach, which prepares a large number of heuristics for control. Various constraints need to be used flexibly not only in dialogue systems but also in sub-systems such as parsers. In fact, [Nagata and Morimoto] and [Maxwell and Kaplan] pointed out that in modern grammatical formalisms such as HPSG and LFG, which employ two sorts of linguistic constraints (i.e. constraints on phrase structures and on feature structures), a radical efficiency improvement can be obtained by an appropriate strategy for combining the computations on the different kinds of constraints. Concretely speaking, processing the relatively coarser-grained constraints on phrase structure first and the finer-grained constraints, such as those on feature structures, afterwards is more efficient in general. This phenomenon results from the difference in computational complexity between processing these sorts of constraints: syntactic parsing with a context-free grammar can be performed in time and space polynomial in the size of the input sentence, whereas the unification of two feature structures, which is required in processing semantic constraints, needs time and space exponential in the size of the input feature structures in the worst case. In most grammars syntactic constraints are represented as constraints on phrase structures and semantic constraints as constraints among feature structures, so the conventional architecture that processes syntactic constraints first and semantic ones afterwards is consistent with this observation. The cost of unification between feature structures, however, depends strongly on the structure of the inputs, and there are cases where the constraints on feature structures should be processed first because they require less computation than the constraints on phrase structures. The system must therefore change the order in which constraints are processed according to the context in order to be more efficient. The reason why dialogue systems must make use of various constraints flexibly is similar.

To handle this situation, previous systems mainly adopted a 'procedural approach', a sort of production system in which many heuristics of the form 'give priority to processing constraint A in context X' are prepared and the processing is controlled according to them. On the other hand, as large-scale natural language processing systems have been constructed in recent years, another requirement for natural language processing systems has been gaining importance: scalability. Let us represent each constraint as a logical formula that governs a relation between objects. A system of constraints that contains n variables requires at least O(2^n) conditions in order to distinguish the possible contexts, because each variable must be distinguished at least according to whether it is instantiated or not. This means that the number of possible contexts increases exponentially with the number of variables in the system of constraints (which is roughly proportional to the number of constraints themselves).
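As a worked restatement of this counting argument (not part of the original abstract), distinguishing each of the n variables only by whether it is instantiated already yields

\[ \#\{\text{contexts}\} \;\ge\; \prod_{i=1}^{n} \bigl|\{\text{instantiated},\ \text{uninstantiated}\}\bigr| \;=\; 2^{n}, \]

so a procedural rule base that enumerates contexts explicitly grows exponentially in n.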
This combinatorial explosion suggests that the procedural approach may be adequate for designing a small-scale natural language processing system but is not scalable. We adopt another approach that does not prescribe the control of inferences by enumerating possible contexts, as the procedural approach does, but instead assigns a probability to each part of the constraints and controls inferences by general principles such as entropy minimization and expected utility maximization.

First we assign a constant, called the basic probability, to each clause in a given Horn clause program. We then define the probability of the clause that represents a given goal, which we call the top clause, as the product of the basic probabilities of the clauses used in a derivation of the top clause. We also define an 'explanation' as the set of clausal instances used in one of the top-down derivations of the top clause, and define the probability of an explanation in the same way as the probability of a derivation. A literal instance can belong to more than one explanation, and we define the probability of a literal instance p as the sum (or the maximum) of the probabilities of the explanations that contain p. We adopt the probability of each literal instance (or a monotonically increasing function of that probability) as the preference for inference on that literal instance. Intuitively speaking, inference on information that is consistent with a probabilistically large part of the search space is preferred.
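A minimal sketch of these definitions, assuming that explanations have already been enumerated as sets of clause labels with given basic probabilities (all names and numbers below are illustrative, not taken from PHoCS itself):

from math import prod

# Hypothetical basic probabilities of clause instances (labels are illustrative only).
basic_prob = {"s->np vp": 0.9, "vp->v np pp": 0.3, "vp->v np": 0.6,
              "np->np pp": 0.4, "pp->p np": 1.0}

# An explanation is the set of clause instances used in one top-down derivation of the top clause.
explanations = {"attach_to_vp": {"s->np vp", "vp->v np pp", "pp->p np"},
                "attach_to_np": {"s->np vp", "vp->v np", "np->np pp", "pp->p np"}}

# Which explanations each literal instance belongs to (again illustrative).
containing = {"pp(with_a_telescope)": ["attach_to_vp", "attach_to_np"],
              "vp(saw_the_man_with_a_telescope)": ["attach_to_vp"]}

def explanation_prob(expl):
    # Probability of an explanation: product of the basic probabilities of its clauses.
    return prod(basic_prob[c] for c in expl)

def literal_prob(literal, mode="sum"):
    # Probability of a literal instance: sum (or maximum) of the probabilities
    # of the explanations that contain it; used directly as the preference.
    probs = [explanation_prob(explanations[e]) for e in containing[literal]]
    return sum(probs) if mode == "sum" else max(probs)

for lit in containing:
    print(lit, literal_prob(lit, "sum"), literal_prob(lit, "max"))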
In this research we first implemented the Probabilistic Horn Constraint System (PHoCS), a system built on this computational framework. PHoCS stores explanations as a graph, in a packed form obtained by structure sharing, and extends the graph incrementally as the inference proceeds. We call this graph a constraint network. Moreover, in order to reduce the cost of calculating the probability of each literal instance p, probabilities are recomputed only when the structure of the constraint graph around p is changed by an inference, and the probabilities of the literal instances around p are updated only when the difference between the recomputed probability of p and its previous value exceeds a given threshold. This process is iterated until the probability of every literal instance converges.

We compared several definitions of the probability of a literal instance, and of the preference among possible inferences, in an experiment. The experiment concerns parsing sentences that are ambiguous with respect to the attachment of a prepositional phrase following the object of a verb. We measured the number of inference steps taken until the correct parse tree is found and the number taken until all possible trees are found. The experiment showed that the control strategy which computes the probability of a literal instance belonging to more than one explanation as the sum of the probabilities of those explanations took 4.4% fewer steps than the depth-first strategy.

We also presented an algorithm that obtains the basic probability of each clause from corpora, which is an extension of the Inside-Outside algorithm for context-free grammars. An experiment to verify this algorithm was conducted, and the basic probability of each auxiliary term was calculated so as to be proportional to the frequency of the prepositional-phrase attachment patterns. Finally, we investigated the trade-off between the computational cost of learning and the quality of the resulting parameters on a corpus of 50 sentences. The computational cost increases roughly linearly, and the quality of the resulting parameters decreases roughly linearly, with the number of literals in the parse trees. Since the number of literals is in general proportional to the number of ambiguities within a sentence and to the size of the corpus, both the cost and the quality achieved by our parameter learning algorithm can be considered reasonable.
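The threshold-based propagation over the constraint network described above can be pictured as a generic worklist loop. The sketch below is only an assumed, simplified illustration; the graph representation, the recompute function, and the threshold value are all hypothetical and not PHoCS code:

from collections import deque

def propagate(neighbours, recompute, prob, changed, threshold=1e-3):
    # Recompute probabilities only for literal instances whose surroundings changed,
    # and propagate to neighbours only when the change exceeds the threshold.
    queue = deque(changed)
    while queue:                       # iterate until every probability has converged
        p = queue.popleft()
        new = recompute(p, prob)       # recompute prob of p from its current neighbourhood
        if abs(new - prob.get(p, 0.0)) > threshold:
            prob[p] = new
            queue.extend(neighbours.get(p, ()))   # neighbours may need updating too
        else:
            prob[p] = new              # below threshold: record but do not propagate further
    return prob

# Toy usage with made-up numbers: each node's probability is the mean of its neighbours'.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
probs = {"a": 0.2, "b": 0.5, "c": 0.9}
mean_of_neighbours = lambda p, pr: sum(pr[q] for q in graph[p]) / len(graph[p])
print(propagate(graph, mean_of_neighbours, probs, changed=["a"]))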



Date of publication: 1996